Audio-Visual Speaker Identification via Adaptive Fusion Using Reliability Estimates of Both Modalities

Authors

  • Niall A. Fox
  • Brian A. O'Mullane
  • Richard B. Reilly
Abstract

An audio-visual speaker identification system is described in which the audio and visual speech modalities are fused by an automatic, unsupervised process that adapts to local classifier performance by taking into account output-score-based reliability estimates of both modalities. Previously reported methods do not consider that both the audio and the visual modalities can be degraded. The visual modality uses the speaker's lip information. To test the robustness of the system, the audio and visual modalities are degraded to emulate various levels of train/test mismatch, using additive white Gaussian noise for the audio signals and JPEG compression for the visual signals. Experiments are carried out on a large augmented data set from the XM2VTS database. The results show improved audio-visual accuracies at all tested levels of audio and visual degradation, compared to the individual audio or visual modality accuracies. For high mismatch levels, the audio, visual, and autoadapted audio-visual accuracies are 37.1%, 48%, and 71.4%, respectively.
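The idea of weighting each modality by a reliability estimate derived from its own output scores can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the reliability measure or the fusion rule, so the min-max normalisation, the top-two score gap used as a reliability estimate, and the weighted-sum fusion below are all assumptions, and the function names are hypothetical.

```python
def normalize_scores(scores):
    """Min-max normalise per-speaker scores to [0, 1] (assumed normalisation)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All speakers scored identically: no information in this modality.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def reliability(scores):
    """Hypothetical reliability estimate: gap between the best and
    second-best normalised scores. A large gap suggests a confident
    classifier; a small gap suggests an ambiguous one."""
    top, second = sorted(normalize_scores(scores), reverse=True)[:2]
    return top - second


def fuse(audio_scores, visual_scores):
    """Weighted-sum fusion where the audio weight is the audio share of
    the total reliability (an assumed rule, chosen for illustration).
    Returns (index of the identified speaker, audio weight)."""
    a = normalize_scores(audio_scores)
    v = normalize_scores(visual_scores)
    ra, rv = reliability(audio_scores), reliability(visual_scores)
    total = ra + rv
    w = 0.5 if total == 0 else ra / total
    fused = [w * x + (1.0 - w) * y for x, y in zip(a, v)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, w


# Made-up scores for three enrolled speakers: the audio scores are nearly
# tied (as under heavy noise), while the visual scores clearly favour speaker 1.
audio = [-10.0, -10.2, -10.1]
visual = [-30.0, -5.0, -28.0]
speaker, audio_weight = fuse(audio, visual)
```

With these made-up scores the visual modality receives the larger weight, so the fused decision follows the visual classifier. Because the weights are computed from the test-time scores alone, no knowledge of the degradation level is needed, which is the property the abstract describes as automatic and unsupervised.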


Similar resources

Multimodal Speaker Identification using Adaptive Decision Fusion with Reliability Weighted Summation

We present a multimodal open-set speaker identification system that integrates information coming from audio, face and lip motion modalities. For fusion of multiple modalities, the so-called product rule with a novel adaptive reliability-based weighting structure is employed. The proposed adaptive product rule is more robust in the presence of unreliable modalities, provided that the employed r...

Adaptive classifier cascade for multimodal speaker identification

We present a multimodal open-set speaker identification system that integrates information coming from audio, face and lip motion modalities. For fusion of multiple modalities, we propose a new adaptive cascade rule that favors reliable modality combinations through a cascade of classifiers. The order of the classifiers in the cascade is adaptively determined based on the reliability of each mo...

Multimodal speaker/speech recognition using lip motion, lip texture and audio

We present a new multimodal speaker/speech recognition system that integrates audio, lip texture and lip motion modalities. Fusion of audio and face texture modalities has been investigated in the literature before. The emphasis of this work is to investigate the benefits of inclusion of lip motion modality for two distinct cases: speaker and speech recognition. The audio modality is represente...

Improved Speech Recognition using Adaptive Audio-visual Fusion via a Stochastic Secondary Classifier

The adaptive fusion of video and audio is one of the fundamental pursuits of audio-visual speech recognition (AVSR). In this paper the use of a high dimensional secondary classifier on the word likelihood scores from both the audio and video modalities is investigated for the purposes of adaptive fusion. Results are presented that lie above or equal to the boundary of catastrophic fusion acros...

Weight Estimation for Audio-Visual Multi-level Fusion in Bimodal Speaker Identification

This paper investigates the estimation of fusion weights under varying acoustic noise conditions for an audio-visual multi-level hybrid fusion strategy in speaker identification. The multi-level fusion combines model-level and decision-level fusion via dynamic Bayesian networks (DBNs). A novel methodology known as support vector regression (SVR) is utilized to estimate the fusion weights directly ...

Publication date: 2005